Rijuban Rangslang Segment phoneme classification from speech under noisy conditions:Using amplitude-frequency modulation based two-dimensional auto-regressive features with deep neural networks

نویسنده

  • Paavo Alku
چکیده

ii

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

EMG-based wrist gesture recognition using a convolutional neural network

Background: Deep learning has revolutionized artificial intelligence and has transformed many fields. It allows processing high-dimensional data (such as signals or images) without the need for feature engineering. The aim of this research is to develop a deep learning-based system to decode motor intent from electromyogram (EMG) signals. Methods: A myoelectric system based on convolutional ne...

متن کامل

Temporal resolution analysis in frequency domain linear prediction.

Frequency domain linear prediction (FDLP) is a technique for auto-regressive modeling of Hilbert envelopes. In this letter, the resolution properties of the FDLP model are investigated using synthetic signals with impulses immersed in noise. The effect of various factors are studied which affect the temporal resolution and this analysis suggests ways to improve the resolution of the FDLP envelo...

متن کامل

Integrating Deep Neural Networks into Structured Classification Approach based on Weighted Finite-State Transducers

Recently, deep neural networks (DNNs) have been drawing the attention of speech researchers because of their capability for handling nonlinearity in speech feature vectors. On the other hand, speech recognition based on structured classification is also considered important since it realizes the direct classification of automatic speech recognition. For example, a structured classification meth...

متن کامل

Resource aware design of a deep convolutional-recurrent neural network for speech recognition through audio-visual sensor fusion

Today’s Automatic Speech Recognition systems only rely on acoustic signals and often don’t perform well under noisy conditions. Performing multi-modal speech recognition processing acoustic speech signals and lip-reading video simultaneously significantly enhances the performance of such systems, especially in noisy environments. This work presents the design of such an audio-visual system for ...

متن کامل

Recurrent Neural Networks for Noise Reduction in Robust ASR

Recent work on deep neural networks as acoustic models for automatic speech recognition (ASR) have demonstrated substantial performance improvements. We introduce a model which uses a deep recurrent auto encoder neural network to denoise input features for robust ASR. The model is trained on stereo (noisy and clean) audio features to predict clean features given noisy input. The model makes no ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016